Joel Selanikio: The surprising seeds of a big data revolution in healthcare

There's an old joke about a cop who's walking his beat in the middle of the night, and he comes across a guy under a street lamp who's looking at the ground and moving from side to side, and the cop asks him what he's doing. The guy says he's looking for his keys. So the cop takes his time and looks over and kind of makes a little matrix and looks for about two, three minutes. No keys.

The cop says, "Are you sure? Hey buddy, are you sure you lost your keys here?"

And the guy says, "No, no, actually I lost them down at the other end of the street, but the light is better here."

There's a concept that people talk about nowadays called big data, and what they're talking about is all of the information that we're generating through our interaction with and over the Internet, everything from Facebook and Twitter to music downloads, movies, streaming, all this kind of stuff, the live streaming of TED. And the folks who work with big data, for them, they talk about that their biggest problem is we have so much information, the biggest problem is, how do we organize all that information?

I can tell you that working in global health, that is not our biggest problem. Because for us, even though the light is better on the Internet, the data that would help us solve the problems we're trying to solve is not actually present on the Internet. So we don't know, for example, how many people right now are being affected by disasters or by conflict situations. We don't know for really basically any of the clinics in the developing world, which ones have medicines and which ones don't. We have no idea of what the supply chain is for those clinics. We don't know—and this is really amazing to me—we don't know how many children were born, or how many children there are in Bolivia or Botswana or Bhutan. We don't know how many kids died last week in any of those countries. We don't know the needs of the elderly, the mentally ill. For all of these different critically important problems or critically important areas that we want to solve problems in, we basically know nothing at all.

And part of the reason why we don't know anything at all is that the information technology system that we use in global health to find the data to solve these problems is what you see here. And this is about a 5,000-year-old technology. Some of you may have used it before. It's kind of on its way out now, but we still use it for 99 percent of our stuff. This is a paper form, and what you're looking at is a paper form in the hand of a Ministry of Health nurse in Indonesia who is tramping out across the countryside in Indonesia on, I'm sure, a very hot and humid day, and she is going to be knocking on thousands of doors over a period of weeks or months, knocking on the doors and saying, "Excuse me, we'd like to ask you some questions. Do you have any children? Were your children vaccinated?" Because the only way we can actually find out how many children were vaccinated in the country of Indonesia, what percentage were vaccinated, is actually not on the Internet but by going out and knocking on doors, sometimes tens of thousands of doors. Sometimes it takes months to even years to do something like this. You know, a census of Indonesia would probably take two years to accomplish.

And the problem, of course, with all of this is that with all those paper forms—and I'm telling you we have paper forms for every possible thing. We have paper forms for vaccination surveys. We have paper forms to track people who come into clinics. We have paper forms to track drug supplies, blood supplies, all these different paper forms for many different topics, they all have a single common endpoint, and the common endpoint looks something like this. And what we're looking at here is a truckful o' data. This is the data from a single vaccination coverage survey in a single district in the country of Zambia from a few years ago that I participated in. The only thing anyone was trying to find out is what percentage of Zambian children are vaccinated, and this is the data, collected on paper over weeks from a single district, which is something like a county in the United States. You can imagine that, for the entire country of Zambia, answering just that single question looks something like this. Truck after truck after truck filled with stack after stack after stack of data.

And what makes it even worse is that that's just the beginning, because once you've collected all that data, of course someone's going to have to—some unfortunate person is going to have to type that into a computer. When I was a graduate student, I actually was that unfortunate person sometimes. I can tell you, I often wasn't really paying attention. I probably made a lot of mistakes when I did it that no one ever discovered, so data quality goes down.

But eventually that data hopefully gets typed into a computer, and someone can begin to analyze it, and once they have an analysis and a report, hopefully then you can take the results of that data collection and use it to vaccinate children better.

Because if there's anything worse in the field of global public health, I don't know what's worse than allowing children on this planet to die of vaccine-preventable diseases, diseases for which the vaccine costs a dollar. And millions of children die of these diseases every year. And the fact is, millions is a gross estimate because we don't really know how many kids die each year of this.

What makes it even more frustrating is that the data entry part, the part that I used to do as a grad student, can take sometimes six months. Sometimes it can take two years to type that information into a computer, and sometimes, actually not infrequently, it actually never happens. Now try and wrap your head around that for a second. You just had teams of hundreds of people. They went out into the field to answer a particular question. You probably spent hundreds of thousands of dollars on fuel and photocopying and per diem, and then for some reason, momentum is lost or there's no money left, and all of that comes to nothing because no one actually types it into the computer at all. The process just stops. Happens all the time. This is what we base our decisions on in global health: little data, old data, no data.

So back in 1995, I began to think about ways in which we could improve this process. Now 1995, obviously that was quite a long time ago. It kind of frightens me to think of how long ago that was. The top movie of the year was "Die Hard with a Vengeance." As you can see, Bruce Willis had a lot more hair back then. I was working in the Centers for Disease Control, and I had a lot more hair back then as well.

But to me, the most significant thing that I saw in 1995 was this. Hard for us to imagine, but in 1995, this was the ultimate elite mobile device. Right? It wasn't an iPhone. It wasn't a Galaxy phone. It was a Palm Pilot. And when I saw the Palm Pilot for the first time, I thought, why can't we put the forms on these Palm Pilots and go out into the field just carrying one Palm Pilot, which can hold the capacity of tens of thousands of paper forms? Why don't we try to do that? Because if we can do that, if we can actually just collect the data electronically, digitally, from the very beginning, we can just put a shortcut right through that whole process of typing, of having somebody type that stuff into the computer. We can skip straight to the analysis and then straight to the use of the data to actually save lives.

So that's actually what I began to do. Working at CDC, I began to travel to different programs around the world and to train them in using Palm Pilots to do data collection instead of using paper. And it actually worked great. It worked exactly as well as anybody would have predicted. What do you know? Digital data collection is actually more efficient than collecting on paper. While I was doing it, my business partner, Rose, who's here with her husband, Matthew, here in the audience, Rose was out doing similar stuff for the American Red Cross.

The problem was, after a few years of doing that, I realized I had done—I had been to maybe six or seven programs, and I thought, you know, if I keep this up at this pace, over my whole career, maybe I'm going to go to maybe 20 or 30 programs. But the problem is, 20 or 30 programs, like, training 20 or 30 programs to use this technology, that is a tiny drop in the bucket. The demand for this, the need for data to run better programs, just within health, not to mention all of the other fields in developing countries, is enormous. There are millions and millions and millions of programs, millions of clinics that need to track drugs, millions of vaccine programs. There are schools that need to track attendance. There are all these different things for us to get the data that we need to do. And I realized, if I kept up the way that I was doing, I was basically hardly going to make any impact by the end of my career.

And so I began to wrack my brain trying to think about, you know, what was the process that I was doing, how was I training folks, and what were the bottlenecks and what were the obstacles to doing it faster and to doing it more efficiently? And unfortunately, after thinking about this for some time, I realized—I identified the main obstacle. And the main obstacle, it turned out, and this is a sad realization, the main obstacle was me.

So what do I mean by that? I had developed a process whereby I was the center of the universe of this technology. If you wanted to use this technology, you had to get in touch with me. That means you had to know I existed. Then you had to find the money to pay for me to fly out to your country and the money to pay for my hotel and my per diem and my daily rate. So you could be talking about 10,000 or 20,000 or 30,000 dollars if I actually had the time or it fit my schedule and I wasn't on vacation. The point is that anything, any system that depends on a single human being or two or three or five human beings, it just doesn't scale. And this is a problem for which we need to scale this technology and we need to scale it now.

And so I began to think of ways in which I could basically take myself out of the picture. And, you know, I was thinking, how could I take myself out of the picture for quite some time. You know, I'd been trained that the way that you distribute technology within international development is always consultant-based. It's always guys that look pretty much like me flying from countries that look pretty much like this to other countries with people with darker skin. And you go out there, and you spend money on airfare and you spend time and you spend per diem and you spend [on a] hotel and you spend all that stuff. As far as I knew, that was the only way you could distribute technology, and I couldn't figure out a way around it.

But the miracle that happened, I'm going to call it Hotmail for short. Now you may not think of Hotmail as being miraculous, but for me it was miraculous, because I noticed, just as I was wrestling with this problem, I was working in sub-Saharan Africa mostly at the time. I noticed that every sub-Saharan African health worker that I was working with had a Hotmail account. And I thought, it struck me, wait a minute, I know that the Hotmail people surely didn't fly to the Ministry of Health of Kenya to train people in how to use Hotmail. So these guys are distributing technology. They're getting software capacity out there but they're not actually flying around the world. I need to think about this some more. While I was thinking about it, people started using even more things just like this, just as we were. They started using LinkedIn and Flickr and Gmail and Google Maps, all these things. Of course, all of these things are cloud-based and don't require any training. They don't require any programmers. They don't require any consultants, because the business model for all these businesses requires that something be so simple we can use it ourselves with little or no training. You just have to hear about it and go to the website.

And so I thought, what would happen if we built software to do what I'd been consulting in? Instead of training people how to put forms onto mobile devices, let's create software that lets them do it themselves with no training and without me being involved? And that's exactly what we did.

So we created software called Magpi, which has an online form creator. No one has to speak to me. You just have to hear about it and go to the website. You can create forms, and once you've created the forms, you push them to a variety of common mobile phones. Obviously nowadays, we've moved past Palm Pilots to mobile phones. And it doesn't have to be a smartphone. It can be a basic phone like the phone on the right there, you know, the basic kind of Symbian phone that's very common in developing countries. And the great part about this is, it's just like Hotmail. It's cloud-based, and it doesn't require any training, programming, consultants.

But there are some additional benefits as well. Now we knew, when we built this system, the whole point of it, just like with the Palm Pilots, was that you'd have to, you'd be able to collect the data and immediately upload the data and get your data set. But what we found, of course, since it's already on a computer, we can deliver instant maps and analysis and graphing. We can take a process that took two years and compress that down to the space of five minutes. Unbelievable improvements in efficiency. Cloud-based, no training, no consultants, no me. And I told you that in the first few years of trying to do this the old-fashioned way, going out to each country, we reached about, I don't know, probably trained about 1,000 people. What happened after we did this? In the second three years, we had 14,000 people find the website, sign up, and start using it to collect data, data for disaster response, Canadian pig farmers tracking pig disease and pig herds, people tracking drug supplies.

One of my favorite examples, the IRC, International Rescue Committee, they have a program where semi-literate midwives using $10 mobile phones send a text message using our software once a week with the number of births and the number of deaths, which gives IRC something that no one in global health has ever had: a near real-time system of counting babies, of knowing how many kids are born, of knowing how many children there are in Sierra Leone, which is the country where this is happening, and knowing how many children die.

Physicians for Human Rights—this is moving a little bit outside the health field—they are gathering, they're basically training people to do rape exams in Congo, where this is an epidemic, a horrible epidemic, and they're using our software to document the evidence they find, including photographically, so that they can bring the perpetrators to justice.

Camfed, another charity based out of the U.K., Camfed pays girls' families to keep them in school. They understand this is the most significant intervention they can make. They used to track the disbursements, the attendance, the grades, on paper. The turnaround time between a teacher writing down grades or attendance and getting that into a report was about two to three years. Now it's real time, and because this is such a low-cost system and based in the cloud, it costs, for the entire five countries that Camfed runs this in with tens of thousands of girls, the whole cost combined is 10,000 dollars a year. That's less than I used to get just traveling out for two weeks to do a consultation.

So I told you before that when we were doing it the old-fashioned way, I realized all of our work was really adding up to just a drop in the bucket—10, 20, 30 different programs. We've made a lot of progress, but I recognize that right now, even the work that we've done with 14,000 people using this, is still a drop in the bucket. But something's changed. And I think it should be obvious. What's changed now is, instead of having a program in which we're scaling at such a slow rate that we can never reach all the people who need us, we've made it unnecessary for people to get reached by us. We've created a tool that lets programs keep kids in school, track the number of babies that are born and the number of babies that die, to catch criminals and successfully prosecute them, to do all these different things to learn more about what's going on, to understand more, to see more, and to save lives and improve lives.

Thank you.

(Applause)